The Effect of Network Total Order, Broadcast, and Remote-Write Capability on Network-Based Shared Memory Computing

Authors

  • Robert Stets
  • Sandhya Dwarkadas
  • Leonidas Kontothanassis
  • Umit Rencuzogullari
  • Michael L. Scott
Abstract

Emerging system-area networks provide a variety of features that can dramatically reduce network communication overhead. Such features include reduced latency, protected remote memory access, cheap broadcast, and ordering guarantees. In this paper, we evaluate the impact of these features on the implementation of Software Distributed Shared Memory (SDSM), and on the Cashmere system in particular. Cashmere has been implemented on the Compaq Memory Channel network, which supports remote memory writes, inexpensive broadcast, and total ordering of network packets. We evaluate the performance impact of these special network features on the three kinds of SDSM protocol communication: shared data propagation, protocol metadata maintenance, and synchronization, using an 8-node, 32-processor system. Among other things, we compare our base protocol, which leverages all of Memory Channel’s special features, to a protocol based solely on reliable point-to-point messages. We found that the special features improved performance by 18–44% for three of our applications, but less than 12% for our other seven applications. The message-based protocol has the added benefit of allowing shared memory size to grow beyond the addressing limits of the network interface. Moreover, it enables us to implement a home node migration optimization that sometimes more than offsets the advantages of the protocol that fully leverages the Memory Channel features, improving performance by as much as 67%. These results suggest that for systems of modest size, low latency is much more important for SDSM performance than are remote writes, broadcast, or total ordering. At the same time, results on an emulated 32-node system indicate that broadcast based on remote writes of widely-shared data may improve performance by up to 56% for some applications. If hardware broadcast or multicast facilities can be made to scale, they can be beneficial in future system-area networks.


Similar resources

The Effect of Network Total Order, Broadcast, and Remote-Write Capability on Network-Based Shared Memory Computing

Emerging system-area networks provide a variety of features that can dramatically reduce network communication overhead. In this paper, we evaluate the impact of such features on the implementation of Software Distributed Shared Memory (SDSM), and on the Cashmere system in particular. Cashmere has been implemented on the Compaq Memory Channel network, which supports low-latency messages, protec...

HiperTM: High Performance, Fault-Tolerant Transactional Memory

We present HiperTM, a high performance active replication protocol for fault-tolerant distributed transactional memory. The active replication paradigm allows transactions to execute locally, costing them only a single network communication step during transaction execution. Shared objects are replicated across all sites, avoiding remote object accesses. Replica consistency is ensured by a) OS-...

Application of self organizing maps for investigating network latency on a broadcast-based distributed shared memory multiprocessor

Broadcast-based DSM multiprocessors are nowadays an attractive platform for parallel computing due to their advantages in terms of scalability and programmability. In order to obtain high performance out of these systems, network latency reduction techniques should be developed, which requires the knowledge of the relationship between latency and other important DSM parameters. In this paper, se...

A CC-NUMA Prototype Card for SCI-Based PC Clustering

It is extremely important to minimize network access time in constructing a high-performance PC cluster system. For an SCI-based PC cluster, it is possible to reduce the network access time by maintaining a network cache in each cluster node. This paper presents a CC-NUMA card that utilizes a network cache for SCI-based PC clustering. The CC-NUMA card is directly plugged into the PCI slot of each no...

Integration of remote sensing and meteorological data to predict flooding time using deep learning algorithm

Accurate flood forecasting is vital for reducing flood risk. Because of the complicated structure of floods and river flow, the problem is difficult to solve. Artificial neural networks, such as recurrent neural networks, offer good performance on time series data. In recent years, the use of Long Short-Term Memory networks has attracted much attention due to the faults of recurrent ne...


Publication date: 2000